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The RHMC Algorithm for 2 Flavours of Dynamical Staggered Fermions 

M. A. Clark and A. D. Kennedy'' 

'^School of Physics, University of Edinburgh, King's Buildings, Edinburgh, EH9 3JZ, United Kingdom 

We describe an implementation of the Rational Hybrid Monte Carlo (RHMC) algorithm for dynamical compu- 
tations with two flavours of staggered quarks. We discuss several variants of the method, the performance and 
possible sources of error for each of them, and we compare the performance and results to the inexact R algorithm. 



1. Introduction 

Computations with two flavours of dynamical 
staggered quarks are quite popular at present. 
There are a number of possible problems with 
such calculations such as flavour symmetry break- 
ing and non- locality of the square-root of the four- 
flavour action. In this investigation we shall ig- 
nore these and consider only the possible errors 
introduced through algorithmic approximations. 

We propose the use of the Rational Hy- 
brid Monte Carlo (RHMC) algorithm T^. This 
method is stochastically exact, in the sense that 
it is free from molecular dynamics (MD) stepsize 
errors. It is comparable to the usual R algorithm 
[2| in performance, but without the need for ex- 
trapolation in the MD stepsize 6t. 

2. Two Flavour Algorithms 

All Hybrid Molecular Dynamics (HMD) algo- 
rithms have the same underlying structure: a fic- 
titious momentum field is introduced, and the 
gauge field is integrated along classical trajecto- 
ries in fictitious time, interleaved with refresh- 
ment of the momenta from a Gaussian heatbath. 
When integrating Hamilton's equations the eval- 
uation of the fermionic contribution to the force 
acting on the gauge fields is the costliest part of 
generating full QCD gauge field configurations. 
Most algorithmic developments are techniques to 
calculate the fermionic force more efficiently. 

The desired probability distribution for the 
gauge flelds U with rif flavours of staggered 
fermions is 



P{U) e-^^'-^^ det[MiU)] 



nf/4: 



(1) 



where is the staggered fermion kernel and Sg 
is the gauge action. Thus for Uf = 2 flavours 
of fermion we require the square root of the 
fermion determinant. Choosing a suitable nor- 
malisation the spectrum of the staggered fermion 
kernel M is contained in the interval [e, 1], where 

2.1. The R algorithm 

The identity detAl"'/" = exptrln AI"'/^ al- 
lows us to express the determinant as a term in 
the action, and the number of flavours just ap- 
pears as a factor in front of this term, S'pp = 
-^trlnAl. In the R algorithm |21 the eval- 
uation of the force corresponding to this trace 
is performed stochastically, as computing it ex- 
actly would be prohibitively expensive. To do 
this without introducing 0{St) errors we intro- 
duce an auxilliary fleld that is evaluated at time 
(1 ~ ^'lo'^g each integration step. For two 

flavours this breaks time-reversal invariance, vi- 
olates Liouville's theorem, and leads to an irre- 
versible and non area-preserving algorithm: as 
such, it cannot be made exact by the inclusion 
of a Metropolis step. A detailed analysis of the 
errors in the probability distribution produced by 
this algorithm was presented in [3]. 

3. Rational Hybrid Molecular Dynamics 

RHMD uses a uses a rational approxima- 
tion to fractional powers a of a matrix. This is 
analogous to the use of polynomial approxima- 
tions introduced in |4I5I6| . but optimal (Cheby- 
shev) rational approximations give a much closer 
approximation over a given interval than the cor- 
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Figure 1. Comparison of minimax errors for opti- 
mal rational and polynomial approximating func- 
tions to x^^l"^ over the range [0.00003,1] (cor- 
responding to staggered mass parameter m — 
0.025) as a function of approximation degree. 



responding polynomials of similar degree (Fig.^). 
The best choice for this rational approximation 
would seem to be given by a relative minimax 
approximation: this has the property that it is 
guaranteed to deviate by at most an amount A 
relative to the correct value over a given interval, 
max£<a;<i |1 — a:"r(a:)| = A. The errors are of 
the same kind as those introduced by the use of 
floating-point arithmetic. The maximum error A 
falls exponentially with the degree of the rational 
approximation used. 

We approximate the determinant in as a 
Gaussian integral over a bosonic (pseudofermion) 
field (/), 

detA^"'/4 «dctr(X)-2 oc / g-'/'^K^O'*^ 



where ||1 - r(A^) A^"'/^ || < A in the spectral 
norm. The RHMD algorithm proceeds as follows 

• A momentum refreshment heatbath using 
Gaussian noise: P(t^) oc e'al'^l . 

• A pseudofermion heatbath using Gaussian 
noise: = [r (A4(C/))]"^ ^, where P(C) oc 
e~2l€l . Note that the inverse \lr{x) is it- 
self a rational function. 



• An MD trajectory consisting of Tjbt steps 
with Hamiltonian R = ^ Sq + 

<^tf(A^)0 with 111 - f(X)X"f/4|| < A in 
the spectral norm. 

This leads to an algorithm which has finite step- 
size errors of 0{8t^) and errors of 0(A) and 0(A) 
incurred from the use of rational approximations. 

Rational functions can be expressed as a prod- 
uct or as a partial fraction expansion. 



A^) = Co n 
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In partial fraction form the pseudofermionic 
force takes the form 



dH 
dU 



dU ' 



i=l 



where Xi = (-^ — f3i)~^4>- A multishift solver [S] 
can be used to compute all the Xi i^i ^ common 
Krylov space. The computational cost of gener- 
ating the appropriate Krylov space depends upon 
the smallest shift, and the only extra cost is that 
of updating the extra d — 1 solution vectors. 

4. Rational Hybrid Monte Carlo 

The RHMC algorithm is similar to RHMD but 
with the addition of a Metropolis accept/reject 
step. The acceptance probability for this is given 
by 



Pa 



min 1 , e 



det7W"f/4r(A^)2 



(2) 



where H = ^{tt 



S'g + 0V(7W)20. 
If we use a high enough degree rational approx- 
imation r such that A < 1 ulp (unit of least pre- 
cision for the floating point arithmetic used) in 
the computation of the pseudofermion heatbath 
and of SH we can ignore the explicit determinants 
in (|2J without introducing any systematic errors 
beyond the ever-present rounding errors. In prac- 
tice we find that it is sufficiently cheap to use the 
same (machine accuracy) rational approximation 
for the MD integration as well (A = A), although 
this is not logically required. 
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5. Noisy Rational Hybrid Monte Carlo 

RHMCN T* allows the use of a lower de- 
gree rational approximation r while keeping the 
algorithm exact. The algorithm replaces the 
Metropolis step with a Kennedy-Kuti (KK) [Sj 
noisy accept/reject step. A stochastic summa- 
tion is used to estimate the determinant ratio 
in (O. The acceptance probability is defined 
as Pacc ^ X± + \^Q{U,U') for U > U' and 
U < U' respectively, where Q{U, U') is an un- 
biased (noisy) estimator of the determinant ratio 
occurring in ^ and X± are parameters used to 
ensure the resultant probability distribution lies 
in the range [0, 1]. When = A_ = A the aver- 
age acceptance rate is (P) = 2A. 

6. Comparison of RHMC and R 

The use of multishift solvers in the implemen- 
tation of the rational algorithms results in a lower 
computational cost than might otherwise be ex- 
pected. On a 16^ X 32 lattice with /? = 5.26, 
m — 0.01, and St — 0.01 the time to perform one 
trajectory of length r = 0.5 on a single node Pen- 
tium 4 processor with no assembler optimisations 
is 274 minutes for the R algorithm. RHMC takes 
318 minutes when a degree 10 rational function 
is used. Although a single R trajectory is faster, 
RHMC compares very favourably when it is re- 
membered that an 0{St^) extrapolation ought to 
be carried out when using the inexact R algo- 
rithm. 

The computational cost of using RHMCN is 
unfavourable compared with RHMC. Although a 
lower degree rational approximation can be used 
this does not give a large benefit because of the 
efficacy of the multishift solver. To ensure negli- 
gible probability violations occur when A > 1 ulp 
the KK acceptance test becomes increasingly ex- 
pensive for large volumes and small quark masses. 
The cost of performing a single trajectory using 
the parameters given above with a degree 6 ratio- 
nal approximation and typical stochastic summa- 
tion parameters is 385 minutes. We found that 
setting A to give (P) = 70% gives as large an 
acceptance rate as feasible without incurring ex- 
cessive violations, but for large V and small m it 



was found necessary to reduce this to about 50%. 

7. Conclusion 

We have found that it is easy and cheap to com- 
pute rational powers of the staggered fermion ker- 
nel to within machine accuracy using Chebyshev 
rational approximations expressed as partial frac- 
tions, and applied using a multishift solver. This 
form of the RHMC algorithm thus enables the 
exact HMC algorithm to be extended to the case 
of an arbitrary number of flavours. The ability 
to use Krylov space solvers makes RHMC faster 
than the PHMC algorithm |4I5| . In terms of com- 
putational cost there is very little to chose be- 
tween the R and RHMC algorithms. Since this 
seems to be the only possible advantage of R over 
RHMC, we conclude that there is no reason for 
the continued use of the R algorithm. 
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